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ABSTRACT 

We propose a novel framework for studying causal infer- 
ence of gene interactions using a combination of compressive 
sensing and Granger causality techniques. The gist of the 
approach is to discover sparse linear dependencies between 
time series of gene expressions via a Granger-type elimina- 
tion method. The method is tested on the Gardner dataset 
for the SOS network in E. coli, for which both known and 
unknown causal relationships are discovered. 

Index Terms — Compressive sensing, Gene Expression, 
Granger Causality, SOS Network 

1. INTRODUCTION 

One of the focal problems of systems biology is the discovery 
of causal relationships among different components of biolog- 
ical systems. Gene regulatory networks, protein-protein inter- 
action networks, chemical signaling, and metabolic networks 
all exhibit causal relationships between their agents that are 
crucial for proper functionality. Discovering such causal re- 
lationships through experiments may be a challenging task 
due to the technical precision required from the experiments 
and due to the large number of interconnected and dynam- 
ically varying components of the system. It is therefore of 
great importance to develop an analytical framework for dis- 
covering causal connections between genes and for elucidat- 
ing the gene interactome, based on limited experimental data. 
Analytically inferred interactions may consequently be used 
to guide the experimental design process, which would then 
help in further refining the modeling framework. Representa- 
tive work in this direction includes (T]|2). 

One way to detect if a gene causally influences another 
gene is to observe the target gene's expression levels and 
detect if changes in the expressions of the other gene affect 
changes in the expressions of the target. For this purpose, 
a number of authors suggested the use of Bayesian network 
modeling and other machine learning tools, algebraic tech- 
niques, information-theoretic methods, and autoregressive 
Granger causality 0] approaches. 
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We propose a novel method for identifying causal gene 
dependencies based on two ideas: compressive sensing and 
a non-standard version of Granger causality, or elimination 
analysis. The compressive sensing approach is motivated by 
a technique for face recognition used in computer vision, first 
described in [4|. The crux of the approach is to efficiently 
find a sparse linear representation of an image of one individ- 
ual in terms of images of other individuals and the individual 
itself, taken under many different conditions. This setup is 
reminiscent to the one described in f5j[6), where expression 
levels of genes taken under different experimental conditions 
(or, under different gene knockout scenarios) are represented 
as vectors for which a sparse representation is sought. 

The main finding of our analysis indicates that even 
without additional learning steps, one can infer a number of 
causal gene interactions in the E. coli SOS network reported 
by Gardner et al. [7 | using a combination of elimination tech- 
niques and compressive sensing, even when the number of 
points in the time series is very small. Causality is inferred via 
a reduction in the representation residual error and the pres- 
ence of a certain gene as a factor in a sparse representation. 
Granger causality may be incorporated in this framework 
by using time delayed profiles for recognition purposes and 
incorporating them into the sensing matrix as part of an elim- 
ination approach. Unfortunately, the proposed method may 
not yield to improved detection probability upon adding time 
shifted expression profiles, which may be attributed to the fact 
that gene expressions are usually measured at time instances 
that are too widely separated. 

The paper is organized as follows. Compressive sensing 
and Granger causality are briefly introduced in Section[2] The 
proposed causal inference method is described in Section [3j 
while the testing framework and the results pertaining to the 
gene SOS network of E. coli are presented in Section[4] 

2. BACKGROUND 

Compressive sensing (CS) is a technique for efficient sam- 
pling of compressible and if-sparse signals, i.e., signals that 
can be represented by K <C N significant coefficients over 
a ./V-dimensional basis. Sampling of a X-sparse, discrete- 
time signal x of dimension N is accomplished by computing 



a measurement vector y that consists of m <C N linear pro- 
jections, i.e., 

y = *x. (l) 

Here, 3? represents anmxJV matrix, usually over the field 
of real numbers with certain desirable singular value prop- 
erties |8) [9j. One of these properties is known as the re- 
stricted isometry property (RIP), and it is usually considered 
a benchmark condition for practical testing of compressive 
sensing methods. Furthermore, the recent results in [ 8 1 and 
[9 1 showed that a length N signal that is if -sparse can be re- 
covered from only to ~ O (K log (N/K)) linear observations 
of the signal, provided that €> has the RIP. 

Only a few compressive sensing algorithms were suc- 
cessfully integrated into causal inference models [ 1 1. One of 
the simplest causality testing schemes, originally proposed in 
econometrics, is Granger causality. In its original incarna- 
tion, Granger causality was presented as a heuristic statistical 
concept based on prediction. The method has the goal to 
determine if a time series of past observations of a process 
helps to predict the future values of another process. More 
formally, a process X Granger-causes another process Y if 
the future values of Y may be predicted with larger accu- 
racy using the past observations of both Y and X than when 
using only past observations of Y. In the context of linear 
regression, this causality model may be described as follows. 
Let us assume that the value of a process Y at time t may be 
predicted via an autoregressive (linear) rule as 

d 

Y( 1 \t) = J2a i Y(t-iT) + r( 1 \ (2) 

i=l 

where the prediction memory is denoted by d and the sam- 
pling interval by T. The residual error of the predictor equals 
r-W. Now, let 

d d 
YW{t) = Y,a' i Y(t-iT)+J2b i X(t-iT) + r W (3) 

2=1 2 = 1 

be a linear prediction of Y at time t when the past values of 
X are considered in addition to the observation of Y, The 
residual error in this case equals A 2 \ One way to determine 
if X Granger-causes Y is to compare the residual errors: if 
the value of is smaller than the value of A x \ then X 
Granger-causes Y. The word "smaller"' is usually interpreted 
in many different ways, frequently involving constraints other 
than just the difference of the residuals. 

In what follows, we propose to combine the techniques 
captured by (fl} and (O for the purpose of inferring causal re- 
lationships in gene regulatory networks. There are two main 
issues to be addressed in this case: how to discover linear re- 
lationships between expression profiles that may (and usually 
are) correlated with each other and how to adapt the sensing 
matrix $ to perform meaningful Granger-type tests. We dis- 
cuss these two issues in what follows. 



3. THE MODEL 

We start by briefly describing the work [4 1 that illustrates how 
compressive sensing may be used for pattern recognition in 
the presence of noise and outliers. Assume that one wants to 
identify a person, given a particular image of the person, us- 
ing a large database of images of different individuals taken 
under different conditions. One way to perform the identifi- 
cation is to convert each image into a vector, the target vector 
being y, and the database vectors being (p it i — 1, . . . , N, 
where N denotes the number of individuals in the database. 
The sensing matrix equals <fr = [<p 1 ,<p 2 ,--- , <p N ] and recog- 
nition is performed by solving (Q]i for a judiciously chosen 
sparsity level K of the vector x. Then, one can decide on the 
identity of the target individual based on how many of the K 
non-zero components of x correspond to images of the target 
individual. 

Consider the following related scenario: instead of deal- 
ing with images, we focus on expression profiles of N genes 
taken under different experimental conditions. These expres- 
sion time series now represent the columns of the sensing ma- 
trix <p g . , i = 1, . . . , N. Due to the fact that the task at hand 
in gene network analysis is not to identify a gene based on 
its expression, but rather genes that influence its behavior or 
are co-expressed, the setup has to be changed slightly. In this 
case, one should declare one gene of interest to be the tar- 
get gene, with expression profile y, and the goal would be 
to identify correlated genes via (Q]). A similar approach to 
this one was pursued in (6), with the goal of identifying lin- 
ear dependencies between expression vectors based on sparse 
interaction assumptions. Compressive sensing was used only 
as an initial step of a learning procedure, implemented via the 
AdaBoost method. 

In order to incorporate Granger causality into the above 
described framework, let us assume that G denotes the set 
of all possibly interacting genes in a transcription network. 
We perform two compressive sensing tests. First, we find the 
"relevant" genes for target gene j when the expression profile 
of gene i is not present in the sensing matrix. In this case 

y? } = <S> G \ {l , j} xf\ (4) 

where Xj denotes the sparse vector of coefficients, and 
&G\{i,j} is an to x n matrix representing the gene expres- 
sions of all the genes in G except genes i and j, i.e. 

$ G\{i,i} = AlA* ■ ■ > tg^Aw?' ■ -A^AmS ■ ' ' ^flJ- 

Second, we perform one more round of testing for gene j 
when gene i is included in the sensing matrix, 



where 

$ G\{i> = AxA* ■ ■ > tg^Aw}' ' -A^AA^- ■ ■ ' <^J- 



The results of the two experiments may be used for inference 

(2) 

as follows. If the residual error of recovering y\ ' is smaller 

than the residual error of recovering jr 1 , and in addition, 
gene gi was included in the list of K non-zero components 
of the approximation, we conclude that gi causally influences 
gj. We would like to point out that this is one of the simplest 
ways to deduce causal relationships via the proposed method 
- for more precise predictions, one may also record the num- 
ber of perturbations in the corresponding x vector caused by 
including gene g,, the magnitude of the weight of gene gi, 
etc. A full description of these criteria is postponed to be de- 
scribed in the full version of the paper. 

This approach may be taken one step further. Assume that 
S l denotes a right-shift operator by i locations, which when 
applied to a vector [xi, xe+i, . . . ,x?j) produces 

S t (xt,xt+i, ■ ■ ■ ,x N ) = (xe-i,xt, . . ., x N -i). 

If the sensing matrix 3? in the latter of the two above described 
scenarios is chosen as 

[4> gi , S l 4> gi , . . . , S f cj) gi , . . . , cj) gNl S 1 ^ , . . . , s f 4> gN ], 

for some integer / > 1, then the compressive sensing model 
becomes a sparse instant of Granger causality. Although we 
tested this model extensively, the corresponding findings are 
not included in the manuscript. This is due to the fact that in- 
cluding shifts of profiles does not seem to offer practical per- 
formance improvements for the proposed inference method, 
which we attribute to the fact that the size of the sensing ma- 
trix <I> has to be doubled and that the time point measurements 
are taken at instances too apart from each other. The latter 
phenomena clearly does not allow for inferring short-range 
dependencies, usually found in gene regulatory networks. 

Many other extensions of this modeling framework are 
possible - one being a model where the columns of the sens- 
ing matrix represent non-linear functions of pairs of expres- 
sion vectors. This and other models will be discussed in a 
companion paper. 

4. RESULTS 

In order to test the causal compressive sensing method 
outlined in Section 3, we considered expression levels of 
4292 genes of the bacteria E. coli obtained from the KEGG 
database, after removing genes with very few accurate ex- 
pression points. The expression vectors correspond to 22 ex- 
periments with 94 time points for each gene. Consequently, 
the expression profiles may be organized into a 94 x 4292 
real-valued matrix. Each entry in the matrix is normalized as 
follows: the average expression level of a column is computed 
and then subtracted from each element in the column. Hence, 
the normalized matrix contains both positive and negative 
entries. 

The test network of interest was chosen to be the network 
induced by genes of the SOS repair system. Although this 



Table 1. A subnetwork of the SOS network of E. coli: an en- 
try 1 at location indicates that gene j causally regulates 
gene i; indicates that no such regulation is currently known. 
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Table 2. Table of residues obtained via Granger testing, re- 
ported as {res2-resl)lresl (%). The largest changes in the 
residues detected are listed in boldface script. 
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system has at least fifty documented components, we focused 
on nine genes deemed most relevant by the analysis in Q (see 
Table I for the description of the topology of this network). 
One of the key genes in this network, named lexA, is known 
to regulate many genes in the SOS system and was chosen as 
the starting target gene of our analysis. 

Using the setup of eq. (f5]), one in which no gene is ex- 
cluded, we find K — 50 relevant genes for lexA. Since a few 
genes known to be present in the SOS network were not in 
the identified list, we added them back into the pool. The 
reason for performing this step is to reduce the search space 
and to clean up the potential list of candidates for interaction 
partners of lexA. A very large number of genes included in 
the analysis increases the interference level and sensitivity of 
the procedure. To verify that this expurgation procedure is 
meaningful, we computed the p-value of SOS gene inclusion, 
which equals 0.0036. At the end of the procedure, we were 
left with 57 genes. 

We then proceeded to the second stage of identification. 
We used eq. © for a second time, with K = 5, but applied 
the procedure to all 57 genes identified in the first step as tar- 
get genes. We formed a union of genes relevant for each SOS 



Table 3. "U" stands for "unconfirmed" and points to links in 
the Gardner network that were not found by our method;"P" 
stands for "predicted" and corresponds to new links found us- 
ing our method that were not previously reported in Gardner's 
paper; "V" stands for "verified" and refers to links reported in 
Gardner's network and verified by our method. 
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network gene (each set of five relevant genes included at least 
two SOS network genes) and each out-of-network gene. This 
reduced the test set to 34 test genes. These genes were then 
used in the elimination method described in the previous sec- 
tion, i.e., for each of the 34 test genes a Granger test was per- 
formed with respect to the remaining 33 genes. The residuals 
of the reconstruction without and with a given gene are de- 
noted by resl and res2, respectively, and which genes these 
residuals correspond to should be clear from the context. The 
Granger tests are performed for a range of values K = 5 
to K = 20. The residuals obtained for K = 14 are listed 
in Table 2. As already pointed out, the list of relative resid- 
ual changes does not suffice to determine if a gene causally 
influences another gene. One also has to verify that the elim- 
ination gene was identified as relevant when included in the 
compressive sensing process. The results of this combined 
test are presented in Table 3. 

Due to space limitations, we comment only on one re- 
sult presented in Table 3. The result concerns the genes dinl 
and recA. The network of Gardner, published in 2003, did not 
include causal dependencies involving dinl as a regulator of 
recA. Before 2003, it was experimentally verified that recA 
leads to cleavage of a repressor of lexA, but only in the last 
few years did it become apparent that dinl has a strong role in 
regulating recA's ability to promote lexA cleavage. Note that 
this dependency was detected with a very large reduction in 
the residual error, equal to 13%. 

We conclude our exposition with one more interesting 
finding, described in Table 4. There, we listed the "pertur- 
bation" in the list of relevant genes caused by the inclusion of 
a particular gene after its initial exclusion from analysis. The 
perturbation caused by the inclusion of the dinl gene into the 
sensing matrix of the recA is significant: 9 genes (out of 14). 



Table 4. The number of changes in relevant genes induced by 
the elimination process. 
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